111 research outputs found

    Construction de réponses coopératives : du corpus à la modélisation informatique

    The strategies used for information retrieval on the Web differ from one search engine to another, but in general the results do not answer the question asked directly and simply. We present a strategy that aims to define the linguistic and communicative foundations of a Web question answering system that cooperates with the user and tries to provide the most appropriate answer possible, in both form and content. We collected and analysed a corpus of cooperative question-answer pairs built from the Frequently Asked Questions (FAQ) sections of various Web services for users. In our opinion, this corpus is a good experiment on what direct natural language communication on the Web could be. The corpus analysis allowed us to extract the major characteristics of cooperative behaviour and to build the general architecture of our cooperative question answering system, webcoop, which we present at the end of the paper.

    Learning Explicit and Implicit Arabic Discourse Relations.

    We propose in this paper a supervised learning approach to identify discourse relations in Arabic texts. To our knowledge, this work represents the first attempt to focus on both explicit and implicit relations that link adjacent as well as non-adjacent Elementary Discourse Units (EDUs) within the Segmented Discourse Representation Theory (SDRT). We use the Discourse Arabic Treebank corpus (D-ATB), which is composed of newspaper documents extracted from the syntactically annotated Arabic Treebank v3.2 part3, where each document is associated with a complete discourse graph according to the cognitive principles of SDRT. Our list of discourse relations is composed of a three-level hierarchy of 24 relations grouped into 4 top-level classes. To learn them automatically, we use state-of-the-art features whose efficiency has been empirically proved. We investigate how each feature contributes to the learning process. We report our experiments on identifying fine-grained discourse relations, mid-level classes and top-level classes. We compare our approach with three baselines that are based on the most frequent relation, discourse connectives and the features used by Al-Saif and Markert (2011). Our results are very encouraging and outperform all the baselines, with an F-score of 78.1% and an accuracy of 80.6%.
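    The first of the three baselines mentioned above, the most frequent relation, can be illustrated with a minimal sketch. The function name and the toy labels are assumptions made for illustration; they are not taken from the paper or from D-ATB.

```python
from collections import Counter


def most_frequent_baseline(train_labels, test_labels):
    """Predict the most frequent training label for every test instance.

    Returns the majority label and the accuracy it obtains on the test labels.
    """
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    predictions = [majority_label] * len(test_labels)
    correct = sum(p == g for p, g in zip(predictions, test_labels))
    return majority_label, correct / len(test_labels)


# Toy relation labels, invented for illustration only.
train = ["Elaboration", "Elaboration", "Contrast", "Result", "Elaboration"]
test = ["Elaboration", "Contrast", "Elaboration", "Result"]
label, acc = most_frequent_baseline(train, test)
```

    Such a baseline ignores the input entirely, so any learned model is expected to beat it, which is why it is a common lower bound.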

    Commitments to Preferences in Dialogue

    We propose a method for modelling how dialogue moves influence and are influenced by the agents' preferences. We extract constraints on preferences and dependencies among them, even when they are expressed indirectly, by exploiting discourse structure. Our method relies on a study of 20 dialogues chosen at random from the Verbmobil corpus. We then test the algorithm's predictions against the judgements of naive annotators on 3 random unseen dialogues. The average annotator-algorithm agreement and the average inter-annotator agreement show that our method is reliable.
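    The abstract does not say which agreement measure was used. As a hedged illustration only, the sketch below assumes Cohen's kappa averaged over all pairs of labellers; the function names and the toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each labeller's own label distribution.
    expected = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / (n * n)
    if expected == 1.0:  # both labellers used one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def average_pairwise_kappa(annotations):
    """Mean kappa over every pair of label sequences."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Toy labels, invented for illustration only.
annotator_1 = ["y", "y", "n", "n"]
annotator_2 = ["y", "n", "n", "n"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

    Treating the algorithm's output as one more label sequence gives the annotator-algorithm agreement with the same function.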

    IDAT@FIRE2019: Overview of the Track on Irony Detection in Arabic Tweets

    This overview paper describes the first shared task on irony detection for the Arabic language. The task consists of a binary classification of tweets as ironic or not, using a dataset composed of 5,030 Arabic tweets about different political issues and events related to the Middle East and the Maghreb. Tweets in our dataset are written in Modern Standard Arabic but also in different Arabic language varieties, including Egyptian, Gulf, Levantine and Maghrebi dialects. Eighteen teams registered for the task, of which ten submitted their runs. The participants' methods ranged from feature-based approaches to neural networks, using either classical machine learning techniques or ensemble methods. The best performing system achieved an F-score of 0.844, showing that classical feature-based models outperform the neural ones.
    This publication was made possible by NPRP grant 9-175-1-033 from the Qatar National Research Fund (a member of Qatar Foundation). The findings achieved herein are solely the responsibility of the last author. The work of Paolo Rosso was also partially funded by Generalitat Valenciana under grant PROMETEO/2019/121.
    Ghanem, B.; Karoui, J.; Benamara, F.; Moriceau, V.; Rosso, P. (2019). IDAT@FIRE2019: Overview of the Track on Irony Detection in Arabic Tweets. CEUR-WS.org. 380-390. http://hdl.handle.net/10251/180744
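    For the binary ironic / not ironic task described in this entry, an F-score can be computed as in the sketch below. It assumes F1 measured on the ironic class; the abstract does not state which averaging the track used, and the label strings and toy data are hypothetical.

```python
def f1_score(gold, predicted, positive="ironic"):
    """F1 for the positive class, from parallel gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:  # no true positives: precision or recall is zero or undefined
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration only.
gold = ["ironic", "ironic", "not", "not", "ironic"]
pred = ["ironic", "not", "not", "ironic", "ironic"]
score = f1_score(gold, pred)
```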

    Measuring the Effect of Discourse Structure on Sentiment Analysis

    The aim of this paper is twofold: measuring the effect of discourse structure when assessing the overall opinion of a document and analyzing to what extent these effects depend on the corpus genre. Using Segmented Discourse Representation Theory as our formal framework, we propose several strategies to compute the overall rating. Our results show that discourse-based strategies lead to better scores in terms of accuracy and Pearson's correlation than state-of-the-art approaches.
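    Pearson's correlation between predicted and gold document ratings, one of the two scores named above, can be computed as in this minimal sketch. The rating values are invented for illustration and do not come from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


# Toy ratings, invented for illustration only.
gold_ratings = [1, 2, 3, 4, 5]
predicted_ratings = [1, 3, 3, 4, 4]
r = pearson(gold_ratings, predicted_ratings)
```

    Unlike accuracy, this score rewards predictions that are close to the gold rating even when they are not exactly equal to it.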

    Sentiment Composition Using a Parabolic Model

    In this paper, we propose a computational model that accounts for the effects of negation and modality on opinion expressions. Based on linguistic experiments informed by native speakers, we distil these effects according to the type of modality and negation. The model relies on a parabolic representation where an opinion expression is represented as a point on a parabola. Negation is modelled as functions over this parabola, whereas modality is modelled through a family of parabolas of different slopes; each slope corresponds to a different certainty degree. The model is evaluated using two experiments, one involving direct strength judgements on a 7-point scale and the other relying on a sentiment annotated corpus. The empirical evaluation of our model shows that it matches the way humans handle negation and modality in opinionated sentences.

    Evaluation in Discourse: a Corpus-Based Study

    This paper describes the CASOAR corpus, the first manually annotated corpus that explores the impact of discourse structure on sentiment analysis with a study of movie reviews in French and in English as well as letters to the editor in French. While annotating opinions at the expression, the sentence or the document level is a well-established and relatively straightforward task, discourse annotation remains difficult, especially for non-experts. Therefore, combining both annotations poses several methodological problems that we address here. We propose a multi-layered annotation scheme that includes: the complete discourse structure according to the Segmented Discourse Representation Theory, the opinion orientation of elementary discourse units and opinion expressions, and their associated features. We detail each layer, explore the interactions between them and discuss our results. In particular, we examine the correlation between discourse and semantic category of opinion expressions, the impact of discourse relations on both subjectivity and polarity analysis and the impact of discourse on the determination of the overall opinion of a document. Our results demonstrate that discourse is an important cue for sentiment analysis, at least for the corpus genres we have studied.

    Détection automatique de l'ironie dans les tweets en français

    This paper presents a supervised learning method for detecting irony in French tweets. A binary classifier uses state-of-the-art features whose performance is well established, together with new features derived from our corpus study. In particular, we focus on negation and on explicit/implicit oppositions between opinion expressions that have different polarities. The results obtained are encouraging.